Automated Parallelization of Non-uniform Convolutions on Chip Multiprocessors

نویسندگان

  • Y. Zhang
  • M. Kandemir
  • N. Pitsianis
  • X. Sun
چکیده

This paper introduces an approach for automatic parallelization of unequally-spaced convolutions on chip multiprocessors (CMPs). CMPs are very promising candidates for digital processing in signal and image systems with high throughput and low power consumption, compared to uniprocessor based architectures. As CMPs are emerging and evolving in increasing diversity and complexity, automated parallelization of digital signal processing (DSP) algorithms on CMPs becomes essential in embracing, employing and utilizing these architectures in high-performance signal and image systems.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Utilization of Cache Area in On-Chip Multiprocessor

On-chip multiprocessor can be an alternative to the wide-issue superscalar processor approach which is currently the mainstream to exploit the increasing number of transistors on a silicon chip. Utilization of the cache, especially for the remote data is important in the system using such on-chip multiprocessors since the ratio of the oo-chip and the on-chip memory access latencies is higher th...

متن کامل

Performance Analysis of Non-Uniform Cache Architecture Policies for Chip-Multiprocessors Using the Parsec v2.0 Benchmark Suite

Non-Uniform Cache Architectures (NUCA) have been proposed as a solution to overcome wire delays that will dominate on-chip latencies in Chip Multiprocessor designs in the near future. This novel means of organization divides the total memory area into a set of banks that provides non-uniform access latencies and thus faster access to those banks that are close to the processor. A NUCA model can...

متن کامل

Pii: S0141-9331(00)00094-6

On-chip multiprocessor can be an alternative to the wide-issue superscalar processor approach which is currently the mainstream to exploit the increasing number of transistors on a silicon chip. Utilization of the cache, especially for the remote data is important in the system using such on-chip multiprocessors since the ratio of the off-chip and the on-chip memory access latencies is higher t...

متن کامل

Automatic Parallelization for Non-cache Coherent Multiprocessors

Although much work has been done on parallelizing compilers for cache coherent shared memory multiprocessors and message-passing multiprocessors, there is relatively little research on parallelizing compilers for noncache coherent multiprocessors with global address space. In this paper, we present a preliminary study on automatic parallelization for the Cray T3D, a commercial scalable machine ...

متن کامل

The Potential for Using Thread-Level Data Speculation to Facilitate Automatic Parallelization

As we look to the future, and the prospect of a billion transistors on a chip, it seems inevitable that microprocessors will exploit having multiple parallel threads. To achieve the full potential of these “single-chip multiprocessors,” however, we must find a way to parallelize non-numeric applications. Unfortunately, compilers have had little success in parallelizing non-numeric codes due to ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009